[CUDA] support sorting complex numbers #3286
Merged
zcbenz merged 2 commits into ml-explore:main on Mar 25, 2026
Add complex64 sort, argsort, partition, and argpartition support to the CUDA backend
Pull request overview
Adds CUDA backend support for sorting complex64 inputs by removing the previous complex64 exclusion in the CUDA merge-sort implementation and extending NaN-handling for complex types.
Changes:
- Enable CUDA merge-sort kernels for complex64 by removing the complex64 guard in both single-block and multi-block paths.
- Add NaN initialization and NaN-aware comparisons for complex64 within CUDA sort comparator logic.
- Extend Python sort/argsort test coverage to include complex64 and add a complex NaN sort assertion.
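The NaN-aware comparison described above can be illustrated with a small pure-Python sketch. The PR's actual CUDA comparator is not shown on this page, so this assumes NumPy's convention: complex values are ordered lexicographically (real part, then imaginary part), and any value with a NaN component sorts last. The helper names `is_nan`, `complex_less`, and `sort_complex` are hypothetical and exist only for this illustration.

```python
import math
from functools import cmp_to_key


def is_nan(z: complex) -> bool:
    """A complex value counts as NaN if either component is NaN."""
    return math.isnan(z.real) or math.isnan(z.imag)


def complex_less(a: complex, b: complex) -> bool:
    """Return True if a should sort strictly before b.

    Assumed ordering (not taken from the PR): lexicographic on
    (real, imag), with NaN-containing values placed after all others.
    """
    a_nan, b_nan = is_nan(a), is_nan(b)
    if a_nan or b_nan:
        # A non-NaN value precedes a NaN value; two NaNs compare as equal.
        return (not a_nan) and b_nan
    return (a.real, a.imag) < (b.real, b.imag)


def sort_complex(values):
    """Sort complex values using the NaN-aware comparator above."""

    def cmp(a, b):
        if complex_less(a, b):
            return -1
        if complex_less(b, a):
            return 1
        return 0

    return sorted(values, key=cmp_to_key(cmp))


data = [complex(2, 1), complex(float("nan"), 0), complex(1, 5), complex(1, -3)]
result = sort_complex(data)
```

Treating all NaN values as mutually equal and greater than everything else keeps the comparator a strict weak ordering, which comparison-based sorts rely on.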
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| python/tests/test_ops.py | Expands sort/argsort tests to include complex64 and adds a complex NaN sort check. |
| mlx/backend/cuda/sort.cu | Removes the complex64 exclusion and adds complex-aware NaN init/comparison support so complex64 can be sorted on CUDA. |
Contributor
Author
Sorry, I forgot to check Metal's complex support. I will fix it later.
Contributor
Author
Fixed.
zcbenz (Collaborator) approved these changes on Mar 23, 2026 and left a comment:
Looks good to me, thanks!
Proposed changes
Add complex64 sort, argsort, partition, and argpartition support to the CUDA backend
Summary
Remove the `complex64_t` exclusion from the CUDA merge sort kernels. The current merge sort is comparison-based and already supports complex64. The guard was stale.
Checklist
Put an `x` in the boxes that apply.
- `pre-commit run --all-files` to format my code / installed pre-commit prior to committing changes
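The Summary's point that a comparison-based merge sort needs nothing type-specific can be sketched in plain Python. This is not the CUDA kernel from `mlx/backend/cuda/sort.cu`; it is a minimal, hypothetical `merge_sort` that takes a `less` callable, showing that the same routine handles floats and complex values once a comparator is supplied. The lexicographic `lex_less` ordering is an assumption for illustration, and NaN handling is omitted for brevity.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def merge_sort(items: Sequence[T], less: Callable[[T, T], bool]) -> List[T]:
    """Stable merge sort that depends only on a strict less-than comparator."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], less)
    right = merge_sort(items[mid:], less)

    merged: List[T] = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take from the right half only when it is strictly smaller,
        # so equal elements keep their original relative order.
        if less(right[j], left[i]):
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def lex_less(a: complex, b: complex) -> bool:
    """Assumed ordering: compare real parts first, then imaginary parts."""
    return (a.real, a.imag) < (b.real, b.imag)


floats_sorted = merge_sort([3.0, 1.0, 2.0], lambda a, b: a < b)
complex_sorted = merge_sort([2 + 1j, 1 + 5j, 1 - 3j], lex_less)
```

Because the element type enters the algorithm only through `less`, excluding one type from the sort is a policy decision rather than an algorithmic requirement, which is consistent with the description of the guard as stale.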